Consider secondary storage selectors during cold volume migration #10957
|
@blueorangutan package |
|
@winterhazel a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## 4.20 #10957 +/- ##
============================================
- Coverage 16.14% 16.14% -0.01%
- Complexity 13253 13255 +2
============================================
Files 5656 5656
Lines 497893 497897 +4
Branches 60374 60375 +1
============================================
- Hits 80405 80401 -4
- Misses 408529 408536 +7
- Partials 8959 8960 +1
|
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 13604 |
|
@blueorangutan package |
|
@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✖️ el8 ✖️ el9 ✔️ debian ✖️ suse15. SL-JID 14956 |
|
@blueorangutan package |
|
@weizhouapache a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 14969 |
|
@blueorangutan package |
|
@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
@winterhazel, is this still relevant for you? (Do we need to push through on this?) |
@DaanHoogland yup, still relevant. Would be nice having this one merged. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✖️ debian ✔️ suse15. SL-JID 16022 |
DaanHoogland
left a comment
clgtm
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16029 |
WIP
TC1: Cold Volume Migration Without Secondary Storage Selector
Objective
Test Steps
Expected Result:
Actual Result:
Test Evidence: pre-migration check (no VOLUME selector exists); secondary storage capacity (all equal); migration command and result; management server log showing secondary storage selection.
Status: PASSED |
|
@blueorangutan package |
|
@RosiKyu a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress. |
|
Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 16539 |
|
Hey @RosiKyu, thanks for your tests! I would just like to point out that the JS interpreter is currently not working as intended (see #12515). Hence, selectors that choose a secondary storage based on information about the volume/account/domain/existing secondary storages will not work as expected. You can, however, test this PR by using a simple rule that directs all volumes to a specific secondary storage, for instance:
(admin) 🐱 > create secondarystorageselector name="direct volumes to secondary storage X" description="directs volumes to secondary storage X" zoneid=13b319e9-108c-4925-96aa-ae556d9a11b2 heuristicrule="'<uuid-of-secondary-storage-X>'" type=VOLUME
With this selector, all volumes will pass through secondary storage X during cold migration. |
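As a rough illustration of the constant rule suggested above, the sketch below assembles the same CloudMonkey command from its parts. The helper name `selector_command` and its parameters are hypothetical; it only formats a string and does not talk to CloudStack. The one point it tries to make explicit is that the heuristic rule is a JavaScript string literal (the UUID wrapped in single quotes) nested inside the double-quoted CLI argument.

```python
def selector_command(name: str, description: str, zone_id: str,
                     store_uuid: str, rule_type: str = "VOLUME") -> str:
    """Build a `create secondarystorageselector` command line (hypothetical helper).

    The rule is a constant JavaScript expression: a single-quoted string
    holding the UUID of the secondary storage that should receive everything.
    """
    rule = f"'{store_uuid}'"  # JS string literal, e.g. '<uuid>'
    return (
        f'create secondarystorageselector name="{name}" '
        f'description="{description}" zoneid={zone_id} '
        f'heuristicrule="{rule}" type={rule_type}'
    )


# Values copied from the comment above; the UUID placeholder is left as-is.
command = selector_command(
    name="direct volumes to secondary storage X",
    description="directs volumes to secondary storage X",
    zone_id="13b319e9-108c-4925-96aa-ae556d9a11b2",
    store_uuid="<uuid-of-secondary-storage-X>",
)
```

The resulting string can be pasted at the CloudMonkey prompt once the placeholder is replaced with a real secondary storage UUID.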
Thanks @winterhazel for the clarification! I was hitting exactly that issue - when enabling
Good to know PR #12515 addresses this. I'll proceed with testing PR #10957 using the simple rule workaround you suggested: |
|
@RosiKyu I think the issue you are facing is not related to #12515. Instead, it may be happening because the value of
Could you check if the following resolves your issue?
cat /etc/cloudstack/management/key
java -classpath /usr/share/cloudstack-common/lib/cloudstack-utils.jar com.cloud.utils.crypt.EncryptionCLI -p <key of the management server> -i true
mysql -u root -p cloud -e "UPDATE configuration SET value='<result of the previous command>' WHERE name='js.interpretation.enabled';"
|
Or just copy the value of the configuration "init". |
RosiKyu
left a comment
LGTM
Tested cold volume migration with and without secondary storage selectors on KVM with NFS storage.
| Test Case | Description | Result |
|---|---|---|
| TC1 | Cold volume migration WITHOUT selector (default fallback) | PASSED |
| TC2 | Cold volume migration WITH VOLUME type selector | PASSED |
TC1: Cold Volume Migration Without Secondary Storage Selector
Objective
Verify that cold volume migration uses the secondary storage with the most free capacity when no secondary storage selector (heuristic rule) is configured for VOLUME type.
Test Steps
- Verified no secondary storage selector exists for VOLUME type in the zone
- Confirmed three secondary storages exist with equal capacity (~1.08 TB free each)
- Started tail on management-server.log to capture secondary storage selection
- Executed cold migration of test-vol-1 from primary storage pri2 to pri1
- Verified migration completed successfully and observed log output
Expected Result:
- Migration should complete successfully
- Log should show: "Secondary storage selector did not direct volume migration to a specific secondary storage; using secondary storage with the most free capacity."
- System should select sec1 (first storage with most/equal free capacity)
Actual Result:
- Migration completed successfully ✓
- Volume test-vol-1 migrated from pri2 to pri1 ✓
- Log confirmed: "Secondary storage selector did not direct volume migration to a specific secondary storage; using secondary storage with the most free capacity." ✓
- System used sec1 (id:1, uuid: 27cfbb80-8eda-4403-8ac8-9572c7edd2b7) as staging storage ✓
Test Evidence:
Pre-migration check - No VOLUME selector exists:
(localcloud) 🐱 > list secondarystorageselectors zoneid=0ec45e01-e0b0-4fbf-a6e8-7bb81dd480e2 type=VOLUME
(localcloud) 🐱 >
Secondary storage capacity (all equal):
(localcloud) 🐱 > list imageStores zoneid=0ec45e01-e0b0-4fbf-a6e8-7bb81dd480e2
{
"count": 3,
"imagestore": [
{
"disksizetotal": 2898029182976,
"disksizeused": 1707413078016,
"id": "27cfbb80-8eda-4403-8ac8-9572c7edd2b7",
"name": "NFS://10.0.32.4/acs/secondary/ref-trl-10723-k-Mol9-rositsa-kyuchukova/ref-trl-10723-k-Mol9-rositsa-kyuchukova-sec1",
"protocol": "nfs"
},
{
"disksizetotal": 2898029182976,
"disksizeused": 1707413078016,
"id": "6062156f-b9f3-417a-bdd9-202fc5bce258",
"name": "NFS://10.0.32.4/acs/secondary/ref-trl-10723-k-Mol9-rositsa-kyuchukova/ref-trl-10723-k-Mol9-rositsa-kyuchukova-sec2",
"protocol": "nfs"
},
{
"disksizetotal": 2898029182976,
"disksizeused": 1707413078016,
"id": "910cc03a-0b37-4a4d-a3f6-968cff43c3c2",
"name": "NFS://10.0.32.4/acs/secondary/ref-trl-10723-k-Mol9-rositsa-kyuchukova/ref-trl-10723-k-Mol9-rositsa-kyuchukova-sec3",
"protocol": "nfs"
}
]
}
Migration command and result:
(localcloud) 🐱 > migrate volume volumeid=08caf0fa-ba34-43aa-addd-f415854c854a storageid=f55f0783-0351-329c-bd4f-8a9b81e4acbd
{
"volume": {
"id": "08caf0fa-ba34-43aa-addd-f415854c854a",
"name": "test-vol-1",
"state": "Ready",
"storage": "ref-trl-10723-k-Mol9-rositsa-kyuchukova-kvm-pri1",
"storageid": "f55f0783-0351-329c-bd4f-8a9b81e4acbd"
}
}
Management server log showing secondary storage selection:
2026-01-26 16:59:21,409 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) copyAsync inspecting src type VOLUME copyAsync inspecting dest type VOLUME
2026-01-26 16:59:21,409 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) About to MIGRATE copy between datasources
2026-01-26 16:59:21,410 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) MIGRATE copy using copyVolumeBetweenPools STARTING
2026-01-26 16:59:21,414 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) Secondary storage selector did not direct volume migration to a specific secondary storage; using secondary storage with the most free capacity.
2026-01-26 16:59:21,417 DEBUG [c.c.s.StatsCollector] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) Verifying image storage [ImageStore {"id":1,"name":"NFS:\/\/10.0.32.4\/acs\/secondary\/ref-trl-10723-k-Mol9-rositsa-kyuchukova\/ref-trl-10723-k-Mol9-rositsa-kyuchukova-sec1","uuid":"27cfbb80-8eda-4403-8ac8-9572c7edd2b7"}]. Capacity: total=[2.6357 TB], used=[1.5535 TB], threshold=[95.00%].
2026-01-26 16:59:22,732 DEBUG [o.a.c.s.m.AncientDataMotionStrategy] (Work-Job-Executor-7:[ctx-ae1d605d, job-46/job-47, ctx-cd5b4539]) (logid:99a6f633) MIGRATE copy using copyVolumeBetweenPools DONE: true
Status: PASSED
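The capacity figures in TC1 can be cross-checked with a little arithmetic. The sketch below is only an illustration: the store names are shortened to sec1/sec2/sec3, the helper names are made up, and the tie-breaking shown is Python's (`max` returns the first maximal item), which happens to agree with the observed choice of sec1 but is not a statement about how CloudStack breaks ties. Dividing by 2**40 reproduces the log's `total=[2.6357 TB]`, which suggests the log reports binary terabytes.

```python
TIB = 2 ** 40  # the log's "TB" figures match bytes / 2**40

# Byte counts copied from the `list imageStores` output above; names shortened.
STORES = [
    {"name": "sec1", "disksizetotal": 2898029182976, "disksizeused": 1707413078016},
    {"name": "sec2", "disksizetotal": 2898029182976, "disksizeused": 1707413078016},
    {"name": "sec3", "disksizetotal": 2898029182976, "disksizeused": 1707413078016},
]


def free_bytes(store: dict) -> int:
    """Free capacity of one image store, in bytes."""
    return store["disksizetotal"] - store["disksizeused"]


def used_fraction(store: dict) -> float:
    """Share of the store that is in use, comparable to the 95% threshold in the log."""
    return store["disksizeused"] / store["disksizetotal"]


def most_free(stores: list) -> dict:
    """Store with the most free bytes; on a tie, the first one in the list."""
    return max(stores, key=free_bytes)


chosen = most_free(STORES)
free_tib = free_bytes(chosen) / TIB
```

With the listed values, each store has about 1.08 TiB free and is roughly 59% used, well under the 95% threshold printed by StatsCollector.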
TC2: Cold Volume Migration WITH Secondary Storage Selector
Objective
Verify that cold volume migration uses the secondary storage specified by a VOLUME type heuristic selector rule, instead of the default "most free capacity" logic.
Test Steps
- Enable the js.interpretation.enabled setting (encrypted value required)
- Create a VOLUME type secondary storage selector directing volumes to sec2
- Deploy a test VM
- Create and attach a data volume
- Stop the VM
- Migrate the volume to a different primary storage
- Verify logs show the selector was used and volume routed through sec2
Expected Result:
- Selector is found and executed during cold migration
- Volume is routed through the secondary storage specified by the selector (sec2)
- Logs show "Found the heuristic rule" instead of "most free capacity" fallback
Actual Result:
- Selector was found and executed successfully
- Volume was routed through sec2 (adb5c6f9-aa8f-4b61-8720-009ff4d34d35) as specified
- Migration completed successfully
Test Evidence:
Environment: ref-trl-10733-k-Mol9-rositsa-kyuchukova
Build: cloudstack-management-4.20.3.0-shapeblue18781 (PR #10957)
1. Enable js.interpretation.enabled (encrypted value):
[root@ref-trl-10733-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# cat /etc/cloudstack/management/key
password
[root@ref-trl-10733-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# java -classpath /usr/share/cloudstack-common/lib/cloudstack-utils.jar com.cloud.utils.crypt.EncryptionCLI -p password -i true
HTN1bVjtD8q/KXJV4oDwxu7HOHipaYFJZIyyl9/Pyr4=
[root@ref-trl-10733-k-Mol9-rositsa-kyuchukova-mgmt1 ~]# mysql -u root -p'P@ssword123' cloud -e "UPDATE configuration SET value='HTN1bVjtD8q/KXJV4oDwxu7HOHipaYFJZIyyl9/Pyr4=' WHERE name='js.interpretation.enabled';"
2. Verify all 4 selector APIs available:
(localcloud) 🐱 > sync
Discovered 832 APIs
(localcloud) 🐱 > list apis 2>/dev/null | grep -i "name.*selector"
"name": "createSecondaryStorageSelector",
"name": "updateSecondaryStorageSelector",
"name": "listSecondaryStorageSelectors",
"name": "removeSecondaryStorageSelector",
3. List secondary storages:
(localcloud) 🐱 > list imagestores filter=id,name,zoneid,url
{
"count": 3,
"imagestore": [
{
"id": "6464b139-8b7c-4fca-aa8e-0f996d650f70",
"name": "NFS://10.0.32.4/acs/secondary/ref-trl-10733-k-Mol9-rositsa-kyuchukova/ref-trl-10733-k-Mol9-rositsa-kyuchukova-sec1",
"url": "NFS://10.0.32.4/acs/secondary/ref-trl-10733-k-Mol9-rositsa-kyuchukova/ref-trl-10733-k-Mol9-rositsa-kyuchukova-sec1",
"zoneid": "b9e63cf4-f908-4f15-8fd6-8faf821e0a03"
},
{
"id": "adb5c6f9-aa8f-4b61-8720-009ff4d34d35",
"name": "NFS://10.0.32.4/acs/secondary/ref-trl-10733-k-Mol9-rositsa-kyuchukova/ref-trl-10733-k-Mol9-rositsa-kyuchukova-sec2",
"url": "NFS://10.0.32.4/acs/secondary/ref-trl-10733-k-Mol9-rositsa-kyuchukova/ref-trl-10733-k-Mol9-rositsa-kyuchukova-sec2",
"zoneid": "b9e63cf4-f908-4f15-8fd6-8faf821e0a03"
},
{
"id": "545de8f9-8e9d-40d2-8bec-7e191ce16bde",
"name": "NFS://10.0.32.4/acs/secondary/ref-trl-10733-k-Mol9-rositsa-kyuchukova/ref-trl-10733-k-Mol9-rositsa-kyuchukova-sec3",
"url": "NFS://10.0.32.4/acs/secondary/ref-trl-10733-k-Mol9-rositsa-kyuchukova/ref-trl-10733-k-Mol9-rositsa-kyuchukova-sec3",
"zoneid": "b9e63cf4-f908-4f15-8fd6-8faf821e0a03"
}
]
}
4. Create VOLUME type selector directing to sec2:
(localcloud) 🐱 > create secondarystorageselector name="direct-volumes-to-sec2" description="directs all volumes to secondary storage 2" zoneid=b9e63cf4-f908-4f15-8fd6-8faf821e0a03 heuristicrule="'adb5c6f9-aa8f-4b61-8720-009ff4d34d35'" type=VOLUME
{
"heuristics": {
"created": "2026-01-27T08:11:21+0000",
"description": "directs all volumes to secondary storage 2",
"heuristicrule": "'adb5c6f9-aa8f-4b61-8720-009ff4d34d35'",
"id": "a05d8cca-db3c-43f0-93cd-dcbf98397a2e",
"name": "direct-volumes-to-sec2",
"type": "VOLUME",
"zoneid": "b9e63cf4-f908-4f15-8fd6-8faf821e0a03"
}
}
5. Verify selector created:
(localcloud) 🐱 > list secondarystorageselectors zoneid=b9e63cf4-f908-4f15-8fd6-8faf821e0a03 type=VOLUME
{
"count": 1,
"heuristics": [
{
"created": "2026-01-27T08:11:21+0000",
"description": "directs all volumes to secondary storage 2",
"heuristicrule": "'adb5c6f9-aa8f-4b61-8720-009ff4d34d35'",
"id": "a05d8cca-db3c-43f0-93cd-dcbf98397a2e",
"name": "direct-volumes-to-sec2",
"type": "VOLUME",
"zoneid": "b9e63cf4-f908-4f15-8fd6-8faf821e0a03"
}
]
}
6. Create data volume:
(localcloud) 🐱 > create volume name="tc2-test-volume" diskofferingid=6c67d0e1-b97c-4b01-9fb9-da63a3e703e6 zoneid=b9e63cf4-f908-4f15-8fd6-8faf821e0a03
{
"volume": {
"id": "c1ebc0e6-6e75-419b-b8ed-0dc17d914239",
"name": "tc2-test-volume",
"state": "Allocated",
...
}
}
7. Attach volume to VM:
(localcloud) 🐱 > attach volume id=c1ebc0e6-6e75-419b-b8ed-0dc17d914239 virtualmachineid=53b092ef-c61f-4588-8806-0c5ee5a79918
{
"volume": {
"id": "c1ebc0e6-6e75-419b-b8ed-0dc17d914239",
"name": "tc2-test-volume",
"state": "Ready",
"storage": "ref-trl-10733-k-Mol9-rositsa-kyuchukova-kvm-pri2",
"storageid": "8c76000e-2d06-3659-92bb-b462871e145a",
...
}
}
8. Stop VM:
(localcloud) 🐱 > stop virtualmachine id=53b092ef-c61f-4588-8806-0c5ee5a79918
{
"virtualmachine": {
"id": "53b092ef-c61f-4588-8806-0c5ee5a79918",
"name": "tc2-test-vm",
"state": "Stopped",
...
}
}
9. Verify volume location before migration:
(localcloud) 🐱 > list volumes id=c1ebc0e6-6e75-419b-b8ed-0dc17d914239 filter=id,name,storage
{
"count": 1,
"volume": [
{
"id": "c1ebc0e6-6e75-419b-b8ed-0dc17d914239",
"name": "tc2-test-volume",
"storage": "ref-trl-10733-k-Mol9-rositsa-kyuchukova-kvm-pri2"
}
]
}
10. Migrate volume to pri1:
(localcloud) 🐱 > migrate volume volumeid=c1ebc0e6-6e75-419b-b8ed-0dc17d914239 storageid=9beb2f30-8367-383f-b1e7-7dcbd0834823
{
"volume": {
"id": "c1ebc0e6-6e75-419b-b8ed-0dc17d914239",
"name": "tc2-test-volume",
"state": "Ready",
"storage": "ref-trl-10733-k-Mol9-rositsa-kyuchukova-kvm-pri1",
"storageid": "9beb2f30-8367-383f-b1e7-7dcbd0834823",
...
}
}
11. Management server logs showing selector in action:
2026-01-27 08:21:08,992 DEBUG [o.a.c.s.h.HeuristicRuleHelper] (Work-Job-Executor-4:[ctx-94df78f8, job-47/job-48, ctx-03272f71]) (logid:e2693615) Found the heuristic rule Heuristic {"heuristicRule":"'adb5c6f9-aa8f-4b61-8720-009ff4d34d35'","id":1,"name":"direct-volumes-to-sec2","type":"VOLUME","uuid":"a05d8cca-db3c-43f0-93cd-dcbf98397a2e"} to apply for zone [Zone {"id": "1", "name": "ref-trl-10733-k-Mol9-rositsa-kyuchukova", "uuid": "b9e63cf4-f908-4f15-8fd6-8faf821e0a03"}].
2026-01-27 08:21:09,280 DEBUG [o.a.c.u.j.JsInterpreter] (Work-Job-Executor-4:[ctx-94df78f8, job-47/job-48, ctx-03272f71]) (logid:e2693615) Executing script ['adb5c6f9-aa8f-4b61-8720-009ff4d34d35'].
2026-01-27 08:21:09,391 DEBUG [o.a.c.u.j.JsInterpreter] (Work-Job-Executor-4:[ctx-94df78f8, job-47/job-48, ctx-03272f71]) (logid:e2693615) The script ['adb5c6f9-aa8f-4b61-8720-009ff4d34d35'] had the following result: [adb5c6f9-aa8f-4b61-8720-009ff4d34d35].
2026-01-27 08:21:09,448 DEBUG [c.c.a.t.Request] (Work-Job-Executor-4:[ctx-94df78f8, job-47/job-48, ctx-03272f71]) (logid:e2693615) Seq 1-511721507659972652: Sending { Cmd , MgmtId: 32987093664357, via: 1(ref-trl-10733-k-Mol9-rositsa-kyuchukova-kvm1), Ver: v1, Flags: 100011, [{"org.apache.cloudstack.storage.command.CopyCommand":{"srcTO":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"c1ebc0e6-6e75-419b-b8ed-0dc17d914239","volumeType":"DATADISK","dataStore":{"org.apache.cloudstack.storage.to.PrimaryDataStoreTO":{"uuid":"8c76000e-2d06-3659-92bb-b462871e145a","name":"ref-trl-10733-k-Mol9-rositsa-kyuchukova-kvm-pri2"...}},...}},"destTO":{"org.apache.cloudstack.storage.to.VolumeObjectTO":{"uuid":"c1ebc0e6-6e75-419b-b8ed-0dc17d914239","volumeType":"DATADISK","dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"NFS://10.0.32.4/acs/secondary/ref-trl-10733-k-Mol9-rositsa-kyuchukova/ref-trl-10733-k-Mol9-rositsa-kyuchukova-sec2","_role":"Image"}},...}},...}}] }
Key log evidence:
- Found the heuristic rule Heuristic {"heuristicRule":"'adb5c6f9-aa8f-4b61-8720-009ff4d34d35'","name":"direct-volumes-to-sec2","type":"VOLUME"} - Selector was found
- Executing script ['adb5c6f9-aa8f-4b61-8720-009ff4d34d35'] - JS rule executed
- The script...had the following result: [adb5c6f9-aa8f-4b61-8720-009ff4d34d35] - Rule returned sec2 UUID
- "destTO":..."dataStore":{"com.cloud.agent.api.to.NfsTO":{"_url":"NFS://...sec2" - Volume routed through sec2
Result: PASSED
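The manual log check used in TC1 and TC2 could be scripted along these lines. This is a sketch under assumptions: it relies only on the three message fragments quoted in the logs above, the function name `staging_decision` is invented, and real management-server logs may interleave lines from other jobs, which this simple scan does not separate.

```python
import re

RULE_FOUND = "Found the heuristic rule"
FALLBACK = "using secondary storage with the most free capacity"
# Matches the JsInterpreter result line and captures the returned UUID.
SCRIPT_RESULT = re.compile(r"had the following result: \[([0-9a-f-]{36})\]")


def staging_decision(log_lines):
    """Classify how the staging secondary storage was chosen.

    Returns ("selector", uuid) when a heuristic rule was found and its script
    returned a UUID, ("fallback", None) when the most-free-capacity message
    appears, and ("unknown", None) when neither pattern is present.
    """
    rule_found = False
    for line in log_lines:
        if RULE_FOUND in line:
            rule_found = True
        match = SCRIPT_RESULT.search(line)
        if rule_found and match:
            return ("selector", match.group(1))
        if FALLBACK in line:
            return ("fallback", None)
    return ("unknown", None)
```

Fed the TC2 lines, the function should report the selector path together with the sec2 UUID; fed the TC1 lines, it should report the fallback path.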
|
@blueorangutan test |
|
@RosiKyu a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests |
|
@winterhazel, let's wait for the regression test results before merging? |
Let's merge after the tests |
|
[SF] Trillian test result (tid-15292)
|
Description
The secondary storage selectors allow operators to specify, for instance, that volumes should go to a specific secondary storage A. Thus, when uploading a volume, it will always be downloaded to secondary storage A.
The cold volume migration moves volumes to a secondary storage before moving them to the destination primary storage. This process does not consider the secondary storage selectors. However, some companies want to dedicate specific secondary storages for cold migration.
To address this, this PR makes the cold volume migration process consider the secondary storage selectors.
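The behaviour described above can be pictured with a small model. This is a minimal sketch, not the PR's Java implementation: the `ImageStore` class, the `pick_staging_store` function, and the choice to fall back when the selector returns nothing (or a UUID that matches no store) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class ImageStore:
    """Hypothetical stand-in for a secondary storage."""
    uuid: str
    total_bytes: int
    used_bytes: int

    @property
    def free_bytes(self) -> int:
        return self.total_bytes - self.used_bytes


def pick_staging_store(
    stores: Sequence[ImageStore],
    selector: Optional[Callable[[], Optional[str]]] = None,
) -> ImageStore:
    """Choose the secondary storage a cold migration stages the volume on."""
    if not stores:
        raise ValueError("no secondary storage available")
    if selector is not None:
        wanted = selector()  # UUID returned by the heuristic rule, or None
        for store in stores:
            if store.uuid == wanted:
                return store
    # Assumed fallback: no selector, or it did not name a known store.
    return max(stores, key=lambda s: s.free_bytes)
```

A constant selector such as `lambda: "<uuid>"` plays the role of a rule that directs every volume to one secondary storage; omitting the selector corresponds to the case where the store with the most free capacity is used.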
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
How Has This Been Tested?
Without any secondary storage selector, I began the cold migration of a volume. I validated that the secondary storage with the most free capacity was used for the migration.
I created a secondary storage selector directing volumes to a specific secondary storage, and began the cold migration of another volume. I validated that the specified secondary storage was used for the migration.